Vectors


In [2]:
x=c(2,7,5)

In [3]:
x


Out[3]:
  1. 2
  2. 7
  3. 5

In [4]:
seq(from=4, length=3, by=3)


Out[4]:
  1. 4
  2. 7
  3. 10

In [5]:
y=seq(from=4, length=3, by=3)
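
seq() accepts several combinations of arguments. As a small sketch that is not part of the original session, the following builds the same vector three ways; `length.out` is the full name of the argument abbreviated as `length` above, and the names `a`, `b`, and `d` are only illustrative.

```r
# Three ways to build the vector 4, 7, 10
a <- seq(from = 4, by = 3, length.out = 3)  # start, step, and count
b <- seq(from = 4, to = 10, by = 3)         # start, end, and step
d <- c(4, 7, 10)                            # spelled out element by element

all.equal(a, b)  # TRUE
all.equal(a, d)  # TRUE
```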

Operations on vectors (applied element-wise)


In [7]:
x+y


Out[7]:
  1. 6
  2. 14
  3. 15

In [8]:
x*y


Out[8]:
  1. 8
  2. 49
  3. 50

In [9]:
x^y


Out[9]:
  1. 16
  2. 823543
  3. 9765625
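
The arithmetic operators pair up elements by position. A short sketch, assuming the same `x` and `y` as above, contrasts this with recycling of a shorter vector and with the inner product, which needs the separate `%*%` operator.

```r
x <- c(2, 7, 5)
y <- seq(from = 4, length.out = 3, by = 3)  # 4, 7, 10

x + y    # 6 14 15: each element of x is combined with the element of y at the same position
x * 2    # 4 14 10: the length-one vector 2 is recycled across x
x %*% y  # 1 x 1 matrix holding the inner product 2*4 + 7*7 + 5*10 = 107
```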

R starts counting at 1, not 0


In [10]:
x[2] # second element of x


Out[10]:
7

In [12]:
x[2:3] # all elements from the second, up to and including the third element


Out[12]:
  1. 7
  2. 5

In [15]:
x[-2] # all elements, except for the second one


Out[15]:
  1. 2
  2. 5

In [17]:
x[-c(1,2)] # all elements, except for the first and second one


Out[17]:
5
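
Besides positive and negative positions, a vector can be indexed with a logical condition. The sketch below is an addition to the session, not part of it, and also shows what happens at index 0 and past the end of the vector.

```r
x <- c(2, 7, 5)

x[x > 4]      # 7 5: keep the elements where the condition is TRUE
which(x > 4)  # 2 3: the positions of those elements
x[0]          # numeric(0): index 0 selects nothing, since counting starts at 1
x[4]          # NA: there is no fourth element
```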

Matrices


In [21]:
z = matrix(seq(1,12), 4, 3) # matrix with 4 rows and 3 columns

In [22]:
z


Out[22]:
1  5   9
2  6  10
3  7  11
4  8  12

In [23]:
z[3:4, 2:3] # show rows 3 to 4, and cols 2 to 3


Out[23]:
7  11
8  12

In [25]:
z[,2:3] # show all rows, but only cols 2 to 3


Out[25]:
5   9
6  10
7  11
8  12

In [31]:
z[,1] # first col only, by default converted into a vector


Out[31]:
  1. 1
  2. 2
  3. 3
  4. 4

In [33]:
z[,1, drop=FALSE] # first col only, keep it as a matrix


Out[33]:
1
2
3
4
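
matrix() fills its values column by column unless told otherwise, which is why the first row of `z` reads 1, 5, 9. The following sketch contrasts this with `byrow = TRUE` and checks the effect of `drop = FALSE`; the name `z_byrow` is only illustrative.

```r
z <- matrix(1:12, nrow = 4, ncol = 3)                      # filled column by column
z_byrow <- matrix(1:12, nrow = 4, ncol = 3, byrow = TRUE)  # filled row by row

z[1, ]        # 1 5 9
z_byrow[1, ]  # 1 2 3

dim(z[, 1])                # NULL: a plain vector has no dimensions
dim(z[, 1, drop = FALSE])  # 4 1: still a matrix, with a single column
```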

Information about objects


In [34]:
dim(z)


Out[34]:
  1. 4
  2. 3

In [35]:
ls() # names of the objects in the current (global) environment


Out[35]:
  1. 'x'
  2. 'y'
  3. 'z'

In [36]:
rm(y) # remove a variable

In [37]:
ls()


Out[37]:
  1. 'x'
  2. 'z'

Generate random data


In [40]:
x=runif(50) # 50 random uniform numbers between 0 and 1

In [42]:
y=rnorm(50) # 50 random draws from a standard normal distribution (mean 0, sd 1)
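
Random draws differ on every run unless the generator is seeded. As a sketch that is not part of the original session, set.seed() makes the draws reproducible; the seed value 1 and the names `x1`, `y1`, `x2`, `y2` are arbitrary choices for illustration.

```r
set.seed(1)        # the seed value is arbitrary
x1 <- runif(50)
y1 <- rnorm(50)

set.seed(1)        # resetting the seed replays the same random stream
x2 <- runif(50)
y2 <- rnorm(50)

identical(x1, x2)  # TRUE
identical(y1, y2)  # TRUE
range(x1)          # both values lie between 0 and 1
mean(y1); sd(y1)   # close to 0 and 1, but not exactly
```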

In [44]:
plot(x,y)



In [48]:
# change plotting character / color; add axis labels
plot(x,y,xlab='Random Uniform', ylab='Random Normal', pch='*', col='blue')



In [52]:
# par() sets global plotting params
# mfrow: plot with 2 rows and 1 col
par(mfrow = c(2,1))

In [53]:
plot(x,y)
hist(y)



In [54]:
# reset the layout to a single plot per figure
par(mfrow = c(1,1))
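
Resetting to c(1,1) by hand assumes that was the layout before. A slightly more general sketch saves the old settings, which par() returns invisibly, and restores them afterwards. The pdf(NULL) call is an assumption made so the example also runs without a display; in a notebook it can be dropped along with dev.off(). The names `old_par` and `restored` are illustrative.

```r
pdf(NULL)  # null graphics device, so nothing is drawn on screen or written to disk

old_par <- par(mfrow = c(2, 1))  # par() returns the previous values of what it changes
plot(runif(50), rnorm(50))
hist(rnorm(50))
par(old_par)                     # restore whatever layout was active before

restored <- par("mfrow")
restored  # 1 1
dev.off()
```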

Reading in data


In [55]:
ls()


Out[55]:
  1. 'x'
  2. 'y'
  3. 'z'

In [58]:
Auto=read.csv('data/Auto.csv')

In [60]:
head(Auto)


Out[60]:
  mpg cylinders displacement horsepower weight acceleration year origin name
1  18         8          307        130   3504           12   70      1 chevrolet chevelle malibu
2  15         8          350        165   3693         11.5   70      1 buick skylark 320
3  18         8          318        150   3436           11   70      1 plymouth satellite
4  16         8          304        150   3433           12   70      1 amc rebel sst
5  17         8          302        140   3449         10.5   70      1 ford torino
6  15         8          429        198   4341           10   70      1 ford galaxie 500

In [61]:
names(Auto)


Out[61]:
  1. 'mpg'
  2. 'cylinders'
  3. 'displacement'
  4. 'horsepower'
  5. 'weight'
  6. 'acceleration'
  7. 'year'
  8. 'origin'
  9. 'name'

In [62]:
dim(Auto)


Out[62]:
  1. 397
  2. 9

In [63]:
class(Auto)


Out[63]:
'data.frame'

In [64]:
summary(Auto)


Out[64]:
      mpg          cylinders      displacement     horsepower      weight    
 Min.   : 9.00   Min.   :3.000   Min.   : 68.0   150    : 22   Min.   :1613  
 1st Qu.:17.50   1st Qu.:4.000   1st Qu.:104.0   90     : 20   1st Qu.:2223  
 Median :23.00   Median :4.000   Median :146.0   88     : 19   Median :2800  
 Mean   :23.52   Mean   :5.458   Mean   :193.5   110    : 18   Mean   :2970  
 3rd Qu.:29.00   3rd Qu.:8.000   3rd Qu.:262.0   100    : 17   3rd Qu.:3609  
 Max.   :46.60   Max.   :8.000   Max.   :455.0   75     : 14   Max.   :5140  
                                                 (Other):287                 
  acceleration        year           origin                  name    
 Min.   : 8.00   Min.   :70.00   Min.   :1.000   ford pinto    :  6  
 1st Qu.:13.80   1st Qu.:73.00   1st Qu.:1.000   amc matador   :  5  
 Median :15.50   Median :76.00   Median :1.000   ford maverick :  5  
 Mean   :15.56   Mean   :75.99   Mean   :1.574   toyota corolla:  5  
 3rd Qu.:17.10   3rd Qu.:79.00   3rd Qu.:2.000   amc gremlin   :  4  
 Max.   :24.80   Max.   :82.00   Max.   :3.000   amc hornet    :  4  
                                                 (Other)       :368  
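
In the summary above, horsepower is reported as counts per value rather than as quartiles, which means R did not read that column as numeric. This usually happens when at least one entry is not a number. The sketch below assumes, without confirmation from the output above, that missing values are coded as "?"; it uses a tiny inline stand-in for the real file, and `csv_text`, `raw`, and `clean` are illustrative names.

```r
# Tiny stand-in for the real file; the "?" marker is an assumption
csv_text <- "mpg,horsepower
18,130
25,?
15,165"

raw   <- read.csv(text = csv_text)                    # "?" keeps the column from being numeric
clean <- read.csv(text = csv_text, na.strings = "?")  # "?" is read as NA instead

is.numeric(raw$horsepower)    # FALSE
is.numeric(clean$horsepower)  # TRUE
nrow(na.omit(clean))          # 2: the row with the missing value is dropped
```

If the same assumption holds for data/Auto.csv, passing `na.strings` to read.csv() there would give a numeric horsepower column.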

In [67]:
plot(Auto$cylinders, Auto$mpg)



In [69]:
# column names can be abbreviated after $ (partial matching)
# plot(Auto$cyl, Auto$mpg)
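
Partial matching applies to `$` but not to `[[`, which matches names exactly by default. A minimal sketch on a small made-up data frame (`df` and its values are hypothetical):

```r
df <- data.frame(cylinders = c(4, 8), mpg = c(30, 15))

df$cyl             # 4 8: `$` completes the unique prefix "cyl" to "cylinders"
df[["cyl"]]        # NULL: `[[` requires the exact name by default
df[["cylinders"]]  # 4 8
```

Spelling the name out in full is usually safer, since an abbreviation can stop being unique once more columns are added.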

In [73]:
attach(Auto) # add Auto to the search path so its columns can be referenced by name


The following objects are masked from Auto (pos = 3):

    acceleration, cylinders, displacement, horsepower, mpg, name,
    origin, weight, year


In [74]:
search()


Out[74]:
  1. '.GlobalEnv'
  2. 'Auto'
  3. 'Auto'
  4. 'package:stats'
  5. 'package:graphics'
  6. 'package:grDevices'
  7. 'package:utils'
  8. 'package:datasets'
  9. 'package:methods'
  10. 'Autoloads'
  11. 'package:base'
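
The search path lists 'Auto' twice because attach() was run more than once, which is also what the masking message reports. As a sketch of two ways to avoid this, using a small made-up data frame (`cars_df` is hypothetical): with() evaluates an expression inside a data frame without touching the search path, and detach() undoes an attach().

```r
cars_df <- data.frame(cylinders = c(4, 8, 4), mpg = c(30, 15, 28))

# with() looks up column names in the data frame for one expression only
with(cars_df, mean(mpg[cylinders == 4]))  # 29

# attach() and detach() used as a pair leave the search path as it was
before <- search()
attach(cars_df)
mean(mpg)        # mean of 30, 15, 28
detach(cars_df)
identical(before, search())  # TRUE
```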

In [96]:
par(mfrow = c(2,2))
plot(cylinders)
hist(cylinders)
plot(as.factor(cylinders), main="Factors of cylinders")
plot(as.factor(cylinders), mpg, main="Boxplot: cyl. vs. mpg",
     xlab="cylinders", ylab="mpg")